The Leaf Path Projection View of Parse Trees: Exploring String Kernels for HPSG Parse Selection
نویسندگان
چکیده
We present a novel representation of parse trees as lists of paths (leaf projection paths) from leaves to the top level of the tree. This representation allows us to achieve significantly higher accuracy in the task of HPSG parse selection than standard models, and makes the application of string kernels natural. We define tree kernels via string kernels on projection paths and explore their performance in the context of parse disambiguation. We apply SVM ranking models and achieve an exact sentence accuracy of 85.40% on the Redwoods corpus.
منابع مشابه
Using Lexical and Compositional Semantics to Improve HPSG Parse Selection
Using Lexical and Compositional Semantics to Improve HPSG Parse Selection Chair of the Supervisory Committee: Dr. Emily Bender University of Washington Accurate parse ranking is essential for deep linguistic processing applications and is one of the classic problems for academic research in NLP. Despite significant advances, there remains a big need for improvement, especially for domains where...
متن کاملEfficiency in Unification-Based N-Best Parsing
We extend a recently proposed algorithm for n-best unpacking of parse forests to deal efficiently with (a) Maximum Entropy (ME) parse selection models containing important classes of non-local features, and (b) forests produced by unification grammars containing significant proportions of globally inconsistent analyses. The new algorithm empirically exhibits a linear relationship between proces...
متن کاملUnsupervised Parse Selection for HPSG
Parser disambiguation with precision grammars generally takes place via statistical ranking of the parse yield of the grammar using a supervised parse selection model. In the standard process, the parse selection model is trained over a hand-disambiguated treebank, meaning that without a significant investment of effort to produce the treebank, parse selection is not possible. Furthermore, as t...
متن کاملCross-language Projection of Dependency Trees for Tree-to-tree Machine Translation
Syntax-based machine translation (MT) is an attractive approach for introducing additional linguistic knowledge in corpus-based MT. Previous studies have shown that treeto-string and string-to-tree translation models perform better than tree-to-tree translation models since tree-to-tree models require two high quality parsers on the source as well as the target language side. In practice, high ...
متن کاملExploring Syntactic Features for Relation Extraction using a Convolution Tree Kernel
This paper proposes to use a convolution kernel over parse trees to model syntactic structure information for relation extraction. Our study reveals that the syntactic structure features embedded in a parse tree are very effective for relation extraction and these features can be well captured by the convolution tree kernel. Evaluation on the ACE 2003 corpus shows that the convolution kernel ov...
متن کامل